Generating Discourse Structures for Written Text

Authors

  • Huong Lê Thanh
  • Geetha Abeysinghe
  • Christian R. Huyck
Abstract

This paper presents a system for automatically generating discourse structures from written text. The system operates at two levels: sentence level and text level. The sentence-level discourse parser uses syntactic information and cue phrases to segment sentences into elementary discourse units and to generate discourse structures of sentences. At the text level, constraints on textual adjacency and textual organization are integrated into a beam search to generate the best discourse structures. Experiments were carried out on documents from the RST Discourse Treebank. Compared with the discourse trees produced by human analysts, the system shows promising results within a reasonable search space.
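The text-level step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, the constraint set, or the search details, so everything below is an assumption made for illustration: the names `beam_search_parse`, `toy_score`, `first_leaf`, and `CUE_PHRASES` are hypothetical, the adjacency constraint is modelled simply by merging only neighbouring spans, and the cue-phrase scorer is a stand-in rather than the authors' method.

```python
from typing import Callable, List, Tuple, Union

# A tree is either an EDU string (leaf) or a pair of subtrees (left, right).
Tree = Union[str, Tuple["Tree", "Tree"]]
# A search state: (cumulative score, sequence of adjacent subtrees).
State = Tuple[float, Tuple[Tree, ...]]

CUE_PHRASES = ("because", "although", "but")  # illustrative list, not from the paper


def first_leaf(tree: Tree) -> str:
    """Return the leftmost EDU of a (sub)tree."""
    while not isinstance(tree, str):
        tree = tree[0]
    return tree


def toy_score(left: Tree, right: Tree) -> float:
    """Hypothetical scorer: reward a merge whose right span opens with a cue phrase."""
    return 1.0 if first_leaf(right).lower().startswith(CUE_PHRASES) else 0.0


def beam_search_parse(
    edus: List[str],
    score_merge: Callable[[Tree, Tree], float] = toy_score,
    beam_width: int = 3,
) -> Tuple[float, Tree]:
    """Build a binary tree over `edus`, merging only adjacent spans."""
    if not edus:
        raise ValueError("need at least one EDU")
    beam: List[State] = [(0.0, tuple(edus))]
    # Each round merges one adjacent pair, so n EDUs need n - 1 rounds.
    for _ in range(len(edus) - 1):
        candidates = {}
        for score, spans in beam:
            for i in range(len(spans) - 1):
                left, right = spans[i], spans[i + 1]
                merged = spans[:i] + ((left, right),) + spans[i + 2:]
                new_score = score + score_merge(left, right)
                # Different merge orders can reach the same state; keep the best score.
                if merged not in candidates or new_score > candidates[merged]:
                    candidates[merged] = new_score
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        # Prune: keep only the `beam_width` highest-scoring partial analyses.
        beam = [(s, spans) for spans, s in ranked[:beam_width]]
    best_score, (tree,) = beam[0]
    return best_score, tree
```

Calling `beam_search_parse(["a", "because b", "c"], beam_width=1)` returns a score together with a nested tuple covering the three units in their original order. The beam width bounds how many partial analyses survive each round, which is what keeps the search space manageable in this kind of approach.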

Similar resources

Building a Japanese Corpus of Temporal-Causal-Discourse Structures Based on SDRT for Extracting Causal Relations

This paper proposes a methodology for generating specialized Japanese data sets for the extraction of causal relations, in which temporal, causal and discourse relations at both the fact level and the epistemic level are annotated. We applied our methodology to a number of text fragments taken from the Balanced Corpus of Contemporary Written Japanese. We evaluated the feasibility of our method...

Towards Generating Text from Discourse Representation Structures

We argue that Discourse Representation Structures form a suitable level of language-neutral meaning representation for micro planning and surface realisation. DRSs can be viewed as the output of macro planning, and form the rough plan and structure for generating a text. We present the first ideas of building a large DRS corpus that enables the development of broad-coverage, robust text generato...

Transmission of Ideology through Translation: A Critical Discourse Analysis of Chomsky’s “Media Control” and its Persian Translations

Among the factors that might influence translators' minds while producing a text is the notion of ideology transmission through text or talk. Adopting Critical Discourse Analysis (CDA) with particular emphasis on the framework of Van Dijk (1999), the present investigation is an attempt to shed light on the relationship between language and ideology involved in translation in general, and more speci...

Machine Translation, Ten Years On: Discourse has yet to make a breakthrough

As already mentioned, most machine translation systems perform translation sentence by sentence. But even in the case of paragraph translation, the discourse structure of the target text tends to be identical to that of the source text. However, the sublanguage discourse structures may differ across the different languages, and thus a translated text which assumes the same discourse structure a...

ITRI-99-15 Generating Embedded Discourse Markers from Rhetorical Structure

To understand a discourse, the reader needs to recover the relations between the discourse elements as intended by the writer. Writers can, and very often do, help the reader along by providing explicit lexical signals of the intended discourse relations through the use of lexicalised discourse markers. In this paper we present a novel approach for generating texts containing multiple, embedded...

Publication date: 2004